{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# RAPIDS cuML \n",
    "## Performance, Boundaries, and Correctness Benchmarks\n",
    "\n",
    "**Description:** This notebook provides a simple and unified means of benchmarking single GPU cuML algorithms against their skLearn counterparts with the `cuml.benchmark` package in RAPIDS cuML. This enables quick and simple measurements of performance, validation of correctness, and investigation of upper bounds.\n",
    "\n",
    "Each benchmark returns a Pandas `DataFrame` with the results. At the end of the notebook, these results are used to draw charts and output to a CSV file. \n",
    "\n",
    "Please refer to the [table of contents](#table_of_contents) for algorithms available to be benchmarked with this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import cuml\n",
    "import pandas as pd\n",
    "\n",
    "from cuml.benchmark.runners import SpeedupComparisonRunner\n",
    "from cuml.benchmark.algorithms import algorithm_by_name\n",
    "\n",
    "import warnings\n",
    "warnings.filterwarnings('ignore', 'Expected column ')\n",
    "\n",
    "print(cuml.__version__)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "N_REPS = 3  # Number of times each test is repeated\n",
    "\n",
    "DATA_NEIGHBORHOODS = \"blobs\"\n",
    "DATA_CLASSIFICATION = \"classification\"\n",
    "DATA_REGRESSION = \"regression\"\n",
    "\n",
    "INPUT_TYPE = \"numpy\"\n",
    "\n",
    "benchmark_results = []"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "SMALL_ROW_SIZES = [2**x for x in range(14, 17)]\n",
    "LARGE_ROW_SIZES = [2**x for x in range(18, 24, 2)]\n",
    "\n",
    "SKINNY_FEATURES = [32, 256]\n",
    "WIDE_FEATURES = [1000, 10000]\n",
    "\n",
    "VERBOSE=True\n",
    "RUN_CPU=True"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def enrich_result(algorithm, runner, result):\n",
    "    result[\"algo\"] = algorithm\n",
    "    result[\"dataset_name\"] = runner.dataset_name\n",
    "    result[\"input_type\"] = runner.input_type\n",
    "    return result\n",
    "\n",
    "def execute_benchmark(algorithm, runner, verbose=VERBOSE, run_cpu=RUN_CPU, **kwargs):\n",
    "    results = runner.run(algorithm_by_name(algorithm), verbose=verbose, run_cpu=run_cpu, **kwargs)\n",
    "    results = [enrich_result(algorithm, runner, result) for result in results]\n",
    "    benchmark_results.extend(results)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Table of Contents<a id=\"table_of_contents\"/>\n",
    "\n",
    "### Benchmarks\n",
    "1. [Neighbors](#neighbors)<br>\n",
    "    1.1 [Nearest Neighbors - Brute Force](#nn_bruteforce)<br>\n",
    "    1.2 [KNeighborsClassifier](#kneighborsclassifier)<br>\n",
    "    1.3 [KNeighborsRegressor](#kneighborsregressor)<br>\n",
    "2. [Clustering](#clustering)<br>\n",
    "    2.1 [DBSCAN - Brute Force](#dbscan_bruteforce)<br>\n",
    "    2.2 [K-Means](#kmeans)<br>\n",
    "3. [Manifold Learning](#manifold_learning)<br>\n",
    "    3.1 [UMAP - Unsupervised](#umap_unsupervised)<br>\n",
    "    3.2 [UMAP - Supervised](#umap_supervised)<br>\n",
    "    3.3 [T-SNE](#tsne)<br>\n",
    "4. [Linear Models](#linear_models)<br>\n",
    "    4.1 [Linear Regression](#linear_regression)<br>\n",
    "    4.2 [Logistic Regression](#logistic_regression)<br>\n",
    "    4.3 [Ridge Regression](#ridge_regression)<br>\n",
    "    4.4 [Lasso Regression](#lasso_regression)<br>\n",
    "    4.5 [ElasticNet Regression](#elasticnet_regression)<br>\n",
    "    4.6 [Mini-batch SGD Classifier](#minibatch_sgd_classifier)<br>\n",
    "5. [Decomposition](#decomposition)<br>\n",
    "    5.1 [PCA](#pca)<br>\n",
    "    5.2 [Truncated SVD](#truncated_svd)<br>\n",
    "6. [Ensemble](#ensemble)<br>\n",
    "    6.1 [Random Forest Classifier](#random_forest_classifier)<br>\n",
    "    6.2 [Random Forest Regressor](#random_forest_regressor)<br>\n",
    "    6.3 [FIL](#fil)<br>\n",
    "    6.4 [Sparse FIL](#sparse_fil)<br>\n",
    "7. [Random Projection](#random_projection)<br>\n",
    "    7.1 [Gaussian Random Projection](#gaussian_random_projection)<br>\n",
    "    7.2 [Sparse Random Projection](#sparse_random_projection)<br>\n",
    "8. [SVM](#svm)<br>\n",
    "    8.1 [SVC - Linear Kernel](#svc_linear_kernel)<br>\n",
    "    8.2 [SVC - RBF Kernel](#svc_rbf_kernel)<br>\n",
    "    8.3 [SVR - Linear Kernel](#svr_linear_kernel)<br>\n",
    "    8.4 [SVR - RBF Kernel](#svr_rbf_kernel)<br>\n",
    "    \n",
    "### Chart & Store Results\n",
    "9. [Convert to Pandas DataFrame](#convert_to_pandas)<br>\n",
    "10. [Chart Results](#chart_results)<br>\n",
    "11. [Output to CSV](#output_csv)<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Neighbors<a id=\"neighbors\"/>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Nearest Neighbors - Brute Force<a id=\"nn_bruteforce\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_NEIGHBORHOODS,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS,\n",
    ")\n",
    "\n",
    "execute_benchmark(\"NearestNeighbors\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### KNeighborsClassifier<a id=\"kneighborsclassifier\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_CLASSIFICATION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"KNeighborsClassifier\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### KNeighborsRegressor<a id=\"kneighborsregressor\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_REGRESSION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"KNeighborsRegressor\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Clustering<a id=\"clustering\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### DBSCAN - Brute Force<a id=\"dbscan_bruteforce\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_NEIGHBORHOODS,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"DBSCAN\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### K-means Clustering<a id=\"kmeans\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_NEIGHBORHOODS,\n",
    "    input_type=\"numpy\",\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"KMeans\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Manifold Learning<a id=\"manifold_learning\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### UMAP - Unsupervised<a id=\"umap_unsupervised\"/>\n",
    "CPU benchmark requires UMAP-learn"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=WIDE_FEATURES,\n",
    "    dataset_name=DATA_NEIGHBORHOODS,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"UMAP-Unsupervised\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### UMAP - Supervised<a id=\"umap_supervised\"/>\n",
    "CPU benchmark requires UMAP-learn"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=WIDE_FEATURES,\n",
    "    dataset_name=DATA_NEIGHBORHOODS,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"UMAP-Supervised\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### T-SNE<a id=\"tsne\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES, \n",
    "    dataset_name=DATA_NEIGHBORHOODS,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "# Due to extreme high runtime, the CPU benchmark \n",
    "# is disabled. Use run_cpu=True to re-enable. \n",
    "\n",
    "execute_benchmark(\"TSNE\", runner, run_cpu=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Linear Models<a id=\"linear_models\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Linear Regression<a id=\"linear_regression\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_REGRESSION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"LinearRegression\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Logistic Regression<a id=\"logistic_regression\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_CLASSIFICATION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"LogisticRegression\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Ridge Regression<a id=\"ridge_regression\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_REGRESSION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"Ridge\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Lasso Regression<a id=\"lasso_regression\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_REGRESSION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"Lasso\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### ElasticNet Regression<a id=\"elasticnet_regression\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_REGRESSION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"ElasticNet\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Mini-batch SGD Classifier<a id=\"minibatch_sgd_classifier\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_CLASSIFICATION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"MBSGDClassifier\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Decomposition<a id=\"decomposition\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### PCA<a id=\"pca\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=WIDE_FEATURES,\n",
    "    dataset_name=DATA_NEIGHBORHOODS,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"PCA\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Truncated SVD<a id=\"truncated_svd\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=WIDE_FEATURES,\n",
    "    dataset_name=DATA_NEIGHBORHOODS,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"TSVD\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Ensemble<a id=\"ensemble\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Random Forest Classifier<a id=\"random_forest_classifier\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_CLASSIFICATION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"RandomForestClassifier\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Random Forest Regressor<a id=\"random_forest_regressor\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_REGRESSION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"RandomForestRegressor\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### FIL<a id=\"fil\"/>\n",
    "CPU benchmark requires XGBoost Library"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_CLASSIFICATION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"FIL\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sparse FIL<a id=\"sparse_fil\"/>\n",
    "Requires TreeLite library"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_CLASSIFICATION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"Sparse-FIL-SKL\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Random Projection<a id=\"random_projection\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Gaussian Random Projection<a id=\"gaussian_random_projection\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES,\n",
    "    bench_dims=WIDE_FEATURES,\n",
    "    dataset_name=DATA_NEIGHBORHOODS,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"GaussianRandomProjection\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Sparse Random Projection<a id=\"sparse_random_projection\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES,\n",
    "    bench_dims=WIDE_FEATURES,\n",
    "    dataset_name=DATA_NEIGHBORHOODS,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"SparseRandomProjection\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## SVM<a id=\"svm\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### SVC - Linear Kernel<a id=\"svc_linear_kernel\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_CLASSIFICATION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "# Due to extreme high runtime, the CPU benchmark \n",
    "# is disabled. Use run_cpu=True to re-enable. \n",
    "\n",
    "execute_benchmark(\"SVC-Linear\", runner, run_cpu=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### SVC - RBF Kernel<a id=\"svc_rbf_kernel\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_CLASSIFICATION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "# Due to extreme high runtime, the CPU benchmark \n",
    "# is disabled. Use run_cpu=True to re-enable. \n",
    "\n",
    "execute_benchmark(\"SVC-RBF\", runner, run_cpu=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### SVR - Linear Kernel<a id=\"svr_linear_kernel\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_REGRESSION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "# Due to extreme high runtime, the CPU benchmark \n",
    "# is disabled. Use run_cpu=True to re-enable. \n",
    "\n",
    "execute_benchmark(\"SVR-Linear\", runner, run_cpu=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### SVR - RBF Kernel<a id=\"svr_rbf_kernel\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = cuml.benchmark.runners.SpeedupComparisonRunner(\n",
    "    bench_rows=SMALL_ROW_SIZES, \n",
    "    bench_dims=SKINNY_FEATURES,\n",
    "    dataset_name=DATA_REGRESSION,\n",
    "    input_type=INPUT_TYPE,\n",
    "    n_reps=N_REPS\n",
    ")\n",
    "\n",
    "execute_benchmark(\"SVR-RBF\", runner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Charting & Storing Results<a id=\"charting_and_storing_results\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Convert Results to Pandas DataFrame<a id=\"convert_to_pandas\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.DataFrame(benchmark_results)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Chart Results<a id=\"chart_results\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def chart_single_algo_speedup(df, algorithm):\n",
    "    df = df.loc[df.algo == algorithm]\n",
    "    df = df.pivot(index=\"n_samples\", columns=\"n_features\", values=\"speedup\")\n",
    "    axes = df.plot.bar(title=\"%s Speedup\" % algorithm)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def chart_all_algo_speedup(df):\n",
    "    df = df[[\"algo\", \"n_samples\", \"speedup\"]].groupby([\"algo\", \"n_samples\"]).mean()\n",
    "    df.plot.bar()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chart_single_algo_speedup(df, \"LinearRegression\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chart_all_algo_speedup(df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Output Results to CSV<a id=\"output_csv\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.to_csv(\"benchmark_results.csv\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python (rapids-env7)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
